Characterizing speaker variability using spectral envelopes of vowel sounds

Authors

  • A. N. Harish
  • Doddipatla Rama Sanand
  • Srinivasan Umesh
Abstract

In this paper, we present a study to understand the relation among spectra of speakers enunciating the same sound and investigate the issue of uniform versus non-uniform scaling. This relation is of considerable interest because speaker variability is a major source of concern in many applications, including Automatic Speech Recognition (ASR). Using dynamic programming, we find mapping relations between smoothed spectral envelopes of speakers enunciating the same sound and show that these relations are not linear but have a consistent non-uniform behavior. This non-uniform behavior is also shown to vary across vowels. Through a series of experiments, we show that using the observed non-uniform relation provides better vowel normalization than a simple linear scaling relation. All results in this paper are based on vowel data from the TIMIT, Hillenbrand et al., and North Texas databases.
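To make the dynamic-programming step concrete, here is a minimal sketch of aligning two spectral envelopes along the frequency axis. The abstract does not give the cost function, step constraints, or smoothing method, so the squared-difference local cost, the three allowed steps, and the names `align_envelopes` and `toy_envelope` are all assumptions chosen for illustration, not the authors' procedure.

```python
import math


def align_envelopes(env_a, env_b):
    """Align two envelopes bin-by-bin with a monotonic warping path.

    Assumed setup (not taken from the paper): squared-difference local
    cost and steps (1,1), (1,0), (0,1). Returns (total_cost, path),
    where path is a list of (bin_in_a, bin_in_b) pairs.
    """
    n, m = len(env_a), len(env_b)
    cost = [[math.inf] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            local = (env_a[i] - env_b[j]) ** 2
            if i == 0 and j == 0:
                cost[i][j] = local
                continue
            # Collect the reachable predecessors of cell (i, j).
            candidates = []
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1], (i - 1, j - 1)))
            if i > 0:
                candidates.append((cost[i - 1][j], (i - 1, j)))
            if j > 0:
                candidates.append((cost[i][j - 1], (i, j - 1)))
            best_cost, best_prev = min(candidates, key=lambda c: c[0])
            cost[i][j] = local + best_cost
            back[i][j] = best_prev
    # Trace the optimal path back from the last cell to (0, 0).
    path = []
    cell = (n - 1, m - 1)
    while cell is not None:
        path.append(cell)
        cell = back[cell[0]][cell[1]]
    path.reverse()
    return cost[n - 1][m - 1], path


def toy_envelope(peak_bin, num_bins=40, width=4.0):
    """Hypothetical single-peak envelope standing in for a smoothed spectrum."""
    return [math.exp(-((k - peak_bin) / width) ** 2) for k in range(num_bins)]


env_a = toy_envelope(12)  # "speaker A": peak at bin 12
env_b = toy_envelope(16)  # "speaker B": same shape, peak at bin 16
total_cost, path = align_envelopes(env_a, env_b)
```

The resulting `path` plays the role of a mapping relation between the two frequency axes: a straight line through the origin would correspond to uniform scaling, while a curved path would indicate non-uniform warping. Real envelopes would come from smoothed vowel spectra rather than the synthetic peaks used here.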

Related articles

Discrimination of speaker size from syllable phrases.

The length of the vocal tract is correlated with speaker size and, so, speech sounds have information about the size of the speaker in a form that is interpretable by the listener. A wide range of different vocal tract lengths exist in the population and humans are able to distinguish speaker size from the speech. Smith et al. [J. Acoust. Soc. Am. 117, 305-318 (2005)] presented vowel sounds to ...

Intra-speaker variability in palatometric measures of consonant articulation.

Electropalatometry is a useful clinical and research tool for measuring linguapalatal contact. The goal of this study was to examine intra-speaker variability in performance. Twenty individuals spoke VCV nonsense words using a schwa in the initial position, the 15 palatal consonants, and three corner vowels, /a/, /i/, /u/. A variability index was created to examine speaker consistenc...

Continuous probabilistic transform for voice conversion

Voice conversion, as considered in this paper, is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). Our contribution includes the design of a new methodology for representing the relationship between two sets of spectral envelopes. The proposed method is based on the use of a Gaussian mi...

Variability of Acoustic Features of Hypernasality and it’s Assessment

Hypernasality (HP) is observed across voiced phonemes uttered by Cleft-Palate (CP) speakers with defective velopharyngeal (VP) opening. HP assessment using signal processing technique is challenging due to the variability of acoustic features across various conditions such as speakers, speaking style, speaking rate, severity of HP etc. Most of the study for hypernasality (HP) assessment is base...

An analysis of the size information in classical formant data: Peterson and Barney (1952) revisited

Irino and Patterson (2002) have suggested the Mellin Transform as a model for vocal tract normalisation in the auditory system. In this report, we reanalyse the classical formant data reported by Peterson and Barney (1952) to see if it supports the normalisation hypothesis. The vowel formant data are clustered, quantitatively, using very general assumptions about speaker-variability. These clus...

Publication date: 2009